Model gradient determining methods, apparatuses, devices, and media based on federated learning

ABSTRACT

Implementations include obtaining data volume information indicating an amount of data used by a participating node to train a basic training model based on local data, where the local data includes user data of a target organization corresponding to the participating node; obtaining a node local gradient obtained by the participating node by training the basic training model based on the local data; determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determining, based on the node local gradient of the participating node and the global gradient, a degree of participation of the participating node, where the degree of participation indicates how much the participating node participates in federated learning model training; and determining, based on the degree of participation, an actual model gradient of the participating node.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 202210399999.6, filed on Apr. 15, 2022, which is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

This application relates to the field of computer technologies, and in particular, to model gradient determining methods, apparatuses, devices, and media based on federated learning.

BACKGROUND

As society becomes increasingly aware of the necessity of data privacy, emerging laws such as the General Data Protection Regulation (GDPR) impose strict restrictions on data circulation, leading to an increasingly severe problem of data islands. As an emerging privacy-preserving distributed machine learning method, federated learning provides a new idea for alleviating the problem of data islands. However, in current federated learning, a plurality of data owners jointly train a single unified model, so that the models obtained by all participants are the same, and participants cannot be provided with different models based on different scenarios.

Therefore, how to provide different models for different participants in federated learning is an urgent technical problem to be alleviated.

SUMMARY

Embodiments of this specification provide model gradient determining methods, apparatuses, devices, and media based on federated learning, to alleviate a problem that models obtained by participants in existing federated learning are consistent.

To alleviate the previous technical problem, the embodiments of this specification are implemented as follows:

Some embodiments of this specification provide a model gradient determining method based on federated learning, including: obtaining data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node; obtaining a node local gradient obtained by training the basic training model based on the local data by the participating node; determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determining a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training; and determining an actual model gradient of the participating node based on the degree of participation.

Some embodiments of this specification provide a model gradient determining apparatus based on federated learning, including: a volume information acquisition module, configured to obtain data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node; a local gradient acquisition module, configured to obtain a node local gradient obtained by training the basic training model based on the local data by the participating node; a global gradient determining module, configured to determine, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; a degree-of-participation determining module, configured to determine a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training; and an actual gradient determining module, configured to determine an actual model gradient of the participating node based on the degree of participation.

Some embodiments of this specification provide a model gradient determining device based on federated learning, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores an instruction that can be executed by the at least one processor, and the instruction is executed by the at least one processor so that the at least one processor can: obtain data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node; obtain a node local gradient obtained by training the basic training model based on the local data by the participating node; determine, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determine a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training; and determine an actual model gradient of the participating node based on the degree of participation.

Some embodiments of this specification provide a computer-readable medium, storing a computer-readable instruction, where the computer-readable instruction can be executed by a processor to implement a model gradient determining method based on federated learning.

Embodiments of this specification can achieve the following beneficial effects: in the embodiments of this specification, when federated learning is performed, the data volume information used by the participating node to train the basic training model based on the local data as well as the node local gradient provided by the participating node can be obtained, the global gradient of the federated learning model can be determined, the degree of participation of the participating node can be determined based on the node local gradient and the global gradient, and then the actual model gradient available to the participating node can be determined based on the degree of participation. In this way, different model gradients can be assigned to different participating nodes in the federated learning process so that different federated learning models can be obtained for different participating nodes.

BRIEF DESCRIPTION OF DRAWINGS

To describe the technical solutions in the embodiments of this specification or in the existing technology more clearly, the following briefly describes the accompanying drawings needed for describing the embodiments or the existing technology. Clearly, the accompanying drawings in the following description merely show some embodiments of this application, and a person of ordinary skill in the art can still derive other drawings from these accompanying drawings without creative efforts.

FIG. 1 is a schematic diagram illustrating an application scenario of a model gradient determining method based on federated learning according to some embodiments of this specification;

FIG. 2 is a schematic flowchart illustrating a model gradient determining method based on federated learning according to some embodiments of this specification;

FIG. 3 is a swimlane diagram illustrating a model gradient determining method based on federated learning according to some embodiments of this specification;

FIG. 4 is a schematic structural diagram of a model gradient determining apparatus based on federated learning according to some embodiments of this specification; and

FIG. 5 is a schematic structural diagram of a model gradient determining device based on federated learning according to some embodiments of this specification.

DESCRIPTION OF EMBODIMENTS

To make the objectives, technical solutions, and advantages of one or more embodiments of this specification clearer, the following clearly and comprehensively describes the technical solutions of the one or more embodiments of this specification with reference to specific embodiments and accompanying drawings of this specification. Clearly, the described embodiments are merely some rather than all of the embodiments of this specification. Other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this specification without creative efforts shall fall within the protection scope of one or more embodiments of this specification.

The following describes in detail the technical solutions in the embodiments of this specification with reference to the accompanying drawings.

In the existing technology, federated learning usually needs each participant to obtain a model from a server; then each participant uses local data to train the model, and uploads obtained model gradient data to the server. The server aggregates each piece of gradient data to obtain an updated model parameter or gradient, and then sends the updated model parameter or gradient to each participant so that each participant can obtain a unified model obtained through federated learning.

To alleviate the defects in the existing technology, the technical solutions of this specification provide the following embodiments:

FIG. 1 is a schematic diagram illustrating an application scenario of a model gradient determining method based on federated learning according to some embodiments of this specification. As shown in FIG. 1, the solutions can include the following: participating nodes 1 and a server 2 participating in federated learning. The participating nodes 1 participating in the federated learning can include a plurality of participating nodes, and the server 2 can obtain model gradients provided by the plurality of participating nodes, and then perform aggregation to obtain a federated learning model. In the embodiments of this specification, the participating nodes 1 can train a basic training model based on local data of the participating nodes, and feed back model parameters of the trained model or node local gradients to the server 2; and the server 2 can further obtain data volume information of training data collected by the participating nodes during model training, and obtain a global gradient of the federated learning model based on the data volume information of the participating nodes and the model parameters or the node local gradients provided by the participating nodes. In the embodiments of this specification, the server 2 can further determine degrees of participation of the participating nodes in the federated learning based on the node local gradients of the participating nodes 1 and the global gradient, and then determine actual model gradients matching the participating nodes based on the degrees of participation.

Next, a model gradient determining method based on federated learning provided in some embodiments of this specification is described in detail with reference to the accompanying drawings:

FIG. 2 is a schematic flowchart illustrating a model gradient determining method based on federated learning according to some embodiments of this specification. From a program perspective, the procedure can be executed by a program or an application client mounted on an application server.

As shown in FIG. 2, the procedure can include the following steps:

Step 202: Obtain data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node.

Step 204: Obtain a node local gradient obtained by training the basic training model based on the local data by the participating node.

In the embodiments of this specification, a participating node can be any one of a plurality of participating nodes participating in federated learning. The participating node can represent various organizations participating in the federated learning, including organizations that provide services such as resource processing, payment, leasing, and online and offline transactions. The user data can include user data retained by a user in the target organization. For example, the user data can include service data and transaction records of the user handling services in the target organization, can also include registration information provided by the user to the target organization, and can also include browsing records of the user in the target organization.

The participating node can train the basic training model based on data obtained by the participating node, to obtain the node local gradient, and send the node local gradient and the data volume information of the training to the server. The data volume information can indicate the amount of training data used for model training; the training data itself does not need to be sent to the server. This ensures that data of the participating node does not leave its domain, which improves data security.

The basic training model can represent the latest training model existing in the participating node. In practice, the basic training model can be a model to be trained provided by the server to the participating node. The federated learning can use a plurality of rounds of iterative learning. For the first round of training, the server can send the basic training model to each participating node; after the participating node uses local data to train the model, the trained model or a model parameter can be temporarily stored. In subsequent rounds of training, a model obtained after previous rounds can be updated and trained. In this case, the basic training model can represent the model obtained after the previous rounds of training.

Step 206: Determine, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in.

In practice, in the federated learning, the global gradient is usually obtained just by aggregating the node local gradient provided by the participating node. In the embodiments of this specification, the global gradient of the federated learning model can be obtained with reference to the data volume information of the participating node, which can improve accuracy of the global gradient.

Step 208: Determine a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training.

In practice, due to the impacts of a computing capability of a participating node or quality or an amount of training data, different participating nodes may have different degrees of participation in the federated learning process. For example, the node local gradient provided by the participating node that uses training data with better quality and a larger data volume for model training has a greater influence on the global gradient, and the participating node can have a higher degree of participation.

Step 210: Determine an actual model gradient of the participating node based on the degree of participation.

The server can send the determined actual model gradient of the participating node to a corresponding participating node so that the participating node can obtain a federated learning model that matches the participating node. In practice, the server can further determine the federated learning model that matches the participating node or a model parameter based on the determined actual model gradient of the participating node, and can also send the federated learning model or the model parameter to the participating node. The federated learning model can include a model for evaluation, for example, a risk evaluation model, a reputation evaluation model, or a profit and loss evaluation model.

In the embodiments of this specification, when federated learning is performed, the data volume information used by the participating node to train the basic training model based on the local data as well as the node local gradient provided by the participating node can be obtained, the global gradient of the federated learning model can be determined, the degree of participation of the participating node can be determined based on the node local gradient and the global gradient, and then the actual model gradient available to the participating node can be determined based on the degree of participation. In this way, different model gradients can be assigned to different participating nodes in the federated learning process so that different federated learning models can be obtained for different participating nodes.

It should be understood that in the methods described in one or more embodiments of this specification, the order of some steps can be exchanged based on actual needs, or some steps can be omitted or deleted.

Based on the method in FIG. 2, the embodiments of this specification further provide some specific implementations of the method, which are described below.

To improve accuracy of the global gradient in the federated learning, data quality of participating nodes can also be incorporated into determining of the global gradient in the embodiments of this specification. Optionally, in the embodiments of this specification, after the obtaining a node local gradient obtained by training the basic training model based on the local data by the participating node, the method can further include the following: obtaining a marginal loss of the participating node, where the marginal loss is used to represent a degree of influence of the node local gradient of the participating node on performance of the federated learning model; and determining node mass of the participating node based on the marginal loss; and the determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in can specifically include the following: determining, based on the data volume information, the node local gradient, and the node mass, the global gradient of the federated learning model that the participating node participates in.

The marginal loss can represent a loss of the federated learning model that increases with an increase of participating nodes in the federated learning, and can represent a degree of influence of a node local gradient of a participating node on performance of the federated learning model.

In the embodiments of this specification, the participating node participating in the federated learning model can include a plurality of participating nodes. The obtaining a marginal loss of the participating node can specifically include the following: determining a first reference global model based on a node local gradient of each participating node in the plurality of participating nodes; determining a second reference global model based on a node local gradient of each participating node other than the participating node in the plurality of participating nodes; determining a first model loss of the first reference global model based on a predetermined verification set; determining a second model loss of the second reference global model based on the predetermined verification set; and determining the marginal loss of the participating node based on the first model loss and the second model loss.

As an implementation, assume that in each round $t$ of the federated learning, $M_G^{(t)}$ is used to represent a first global model obtained by aggregating the node local gradients provided by the participating nodes, and $M_{-i}^{(t)}$ is used to represent a second global model obtained by aggregating node local gradients of participating nodes except a participating node $i$; and $l^{(t)}$ and $l_{-i}^{(t)}$ respectively represent losses of the models $M_G^{(t)}$ and $M_{-i}^{(t)}$ on the verification set. Then, a marginal loss $\delta_i^{(t)}$ can be $\delta_i^{(t)} = l_{-i}^{(t)} - l^{(t)}$.

Here,

$M_G^{(t)} = M_G^{(t-1)} + \frac{1}{N}\sum\limits_{j \in W} u_j^{(t)}; \quad M_{-i}^{(t)} = M_G^{(t-1)} + \frac{1}{N-1}\sum\limits_{j \in w'} u_j^{(t)};$

$M_G^{(t-1)}$ represents a federated learning model in a $(t-1)$th round of the federated learning; $N$ represents a total quantity of participating nodes participating in the federated learning; $W$ represents a set of the participating nodes participating in the federated learning, and $j$ represents any participating node participating in the federated learning; $\sum\limits_{j \in W} u_j^{(t)}$ represents the sum of node local gradients of participating nodes in the federated learning; $w'$ represents a set of participating nodes except the participating node $i$ among participating nodes participating in the federated learning; and $\sum\limits_{j \in w'} u_j^{(t)}$ represents the sum of node local gradients of participating nodes other than participating node $i$.

In practice, the first model loss $l^{(t)}$ of the first reference global model $M_G^{(t)}$ and the second model loss $l_{-i}^{(t)}$ of the second reference global model $M_{-i}^{(t)}$ can be calculated by using the predetermined verification set based on a loss function.
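The leave-one-out computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: models and gradients are represented as plain lists of floats, and the names `aggregate_mean`, `marginal_losses`, and `loss_fn` (a stand-in for evaluating a model on the predetermined verification set) are hypothetical.

```python
from typing import Callable, Dict, List

Vector = List[float]


def aggregate_mean(prev_model: Vector, updates: List[Vector]) -> Vector:
    """Return prev_model + (1/n) * sum of updates, element by element."""
    n = len(updates)
    return [p + sum(u[k] for u in updates) / n for k, p in enumerate(prev_model)]


def marginal_losses(
    prev_model: Vector,
    updates: Dict[str, Vector],
    loss_fn: Callable[[Vector], float],
) -> Dict[str, float]:
    """Compute delta_i = l_{-i} - l for every node i (needs at least two nodes).

    loss_fn is assumed to return the loss of a model on the verification set.
    """
    node_ids = list(updates)
    # First reference global model: aggregate all node local gradients.
    full_model = aggregate_mean(prev_model, [updates[j] for j in node_ids])
    loss_full = loss_fn(full_model)
    deltas = {}
    for i in node_ids:
        # Second reference global model: aggregate all gradients except node i's.
        others = [updates[j] for j in node_ids if j != i]
        loss_without_i = loss_fn(aggregate_mean(prev_model, others))
        deltas[i] = loss_without_i - loss_full
    return deltas


# Toy usage: a one-parameter model whose ideal value is 1.0 (illustrative only).
toy_updates = {"a": [1.0], "b": [1.0], "c": [-2.0]}
toy_deltas = marginal_losses([0.0], toy_updates, lambda m: (m[0] - 1.0) ** 2)
```

In this toy run, removing node "a" or "b" makes the loss worse (positive marginal loss), while removing node "c" makes it better (negative marginal loss), which matches the reading that a larger marginal loss marks a more useful local gradient.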

In the embodiments of this specification, a larger marginal loss $\delta_i^{(t)}$ of the participating node $i$ indicates that the local gradient of the participating node $i$ is more important in the federated learning. Optionally, in the embodiments of this specification, the determining node mass of the participating node based on the marginal loss can specifically include the following: determining the node mass of the participating node based on a marginal loss of each participating node in the plurality of participating nodes and a normalization algorithm.

In the embodiments of this specification, node mass $m_i^{(t)}$ of the participating node $i$ in the $t$th round of the federated learning training can be expressed as

$m_i^{(t)} = \frac{w_0^{(t)} + w_i^{(t)}}{\sum_j \left( w_0^{(t)} + w_j^{(t)} \right)};$

$w_0^{(t)}$ represents a constant in the $t$th round, and

$w_i^{(t)} = \frac{\delta_i^{(t)} - \min_j\left( \delta_j^{(t)} \right)}{\sum_j \left( \delta_j^{(t)} - \min_j\left( \delta_j^{(t)} \right) \right)};$

$\min_j\left( \delta_j^{(t)} \right)$ represents the smallest marginal loss in marginal losses of the participating nodes in the $t$th round of the training; and $\sum_j \left( \delta_j^{(t)} - \min_j\left( \delta_j^{(t)} \right) \right)$ represents the sum, over the participating nodes, of the difference between the marginal loss of each participating node and the smallest marginal loss in the $t$th round of the training.

The node mass $m_i^{(t)}$ can also represent a weight of the participating node $i$ in the federated learning, and a local gradient with a larger marginal loss value can correspond to a greater weight. $w_0^{(t)}$ can be a positive number, which ensures that the node mass $m_i^{(t)}$ is not equal to 0.
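The normalization above can be sketched in a few lines. This is a hedged illustration only: the function name `node_mass`, the default value of `w0`, and the handling of the case where all marginal losses are equal (every $w_i$ is treated as 0) are assumptions that the text does not specify.

```python
from typing import Dict


def node_mass(deltas: Dict[str, float], w0: float = 0.1) -> Dict[str, float]:
    """Map marginal losses to node masses that are positive and sum to 1.

    w0 is the positive per-round constant; 0.1 is a hypothetical value.
    """
    d_min = min(deltas.values())
    spread = sum(d - d_min for d in deltas.values())
    if spread == 0:
        # Assumption: with identical marginal losses, every w_i is taken as 0,
        # so all nodes end up with the same mass.
        w = {i: 0.0 for i in deltas}
    else:
        w = {i: (d - d_min) / spread for i, d in deltas.items()}
    total = sum(w0 + wi for wi in w.values())
    return {i: (w0 + wi) / total for i, wi in w.items()}


# Toy usage with three nodes (values are illustrative only).
masses = node_mass({"a": 1.25, "b": 1.25, "c": -1.0})
```

Because `w0` is positive, the node with the smallest marginal loss still receives a small non-zero mass instead of being silently dropped.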

To obtain a more effective global gradient, in the embodiments of this specification, effective participating nodes can also be determined based on marginal losses, and the global gradient can be determined based on node local gradients of the effective participating nodes. The embodiments can also make the aggregated model converge faster and have higher accuracy.

Optionally, the method in the embodiments of this specification can further include the following: determining a participating node with a marginal loss greater than or equal to a predetermined loss threshold in the plurality of participating nodes as an effective participating node; and the determining, based on the data volume information, the node local gradient, and the node mass, the global gradient of the federated learning model that the participating node participates in specifically includes the following: performing an aggregation operation on a node local gradient of each effective participating node based on data volume information and node mass of that effective participating node, to obtain the global gradient.

In the embodiments of this specification, a global gradient $u_G^{(t)}$ in the $t$th round of the training can be obtained based on the following equation:

$u_G^{(t)} = \frac{\sum_i m_i^{(t)} |D_i| u_i^{(t)}}{\sum_i m_i^{(t)} |D_i|};$

$i$ represents an effective participating node, that is, a participating node with a marginal loss greater than or equal to the predetermined loss threshold among the participating nodes participating in the federated learning; $|D_i|$ represents a data volume, of the participating node $i$, of training data used for training based on local data; and $\sum_i$ represents a sum over the effective participating nodes.
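The thresholding and weighted aggregation above can be sketched as follows. This is an illustrative sketch, assuming gradients are flat lists of floats and that node masses have already been computed; the function name `global_gradient` and the choice to raise an error when no node passes the threshold are assumptions.

```python
from typing import Dict, List


def global_gradient(
    updates: Dict[str, List[float]],
    data_sizes: Dict[str, int],
    masses: Dict[str, float],
    deltas: Dict[str, float],
    loss_threshold: float,
) -> List[float]:
    """Aggregate gradients of effective nodes, weighted by mass * data volume."""
    # Effective nodes: marginal loss greater than or equal to the threshold.
    effective = [i for i in updates if deltas[i] >= loss_threshold]
    if not effective:
        # Assumption: the text does not say what happens in this case.
        raise ValueError("no effective participating node in this round")
    denom = sum(masses[i] * data_sizes[i] for i in effective)
    dim = len(updates[effective[0]])
    return [
        sum(masses[i] * data_sizes[i] * updates[i][k] for i in effective) / denom
        for k in range(dim)
    ]


# Toy usage: node "c" falls below the threshold and is excluded.
u_g = global_gradient(
    updates={"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [5.0, 5.0]},
    data_sizes={"a": 100, "b": 300, "c": 1000},
    masses={"a": 0.5, "b": 0.5, "c": 0.0},
    deltas={"a": 0.2, "b": 0.1, "c": -0.3},
    loss_threshold=0.0,
)
```

With equal masses, node "b" contributes three times the weight of node "a" because it trained on three times as much data.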

To improve effectiveness of data, node mass of the effective participating node can be calculated when calculating the node mass of the participating node in the embodiments of this specification, where node mass of each effective participating node is determined based on a marginal loss of each effective participating node and the normalization algorithm.

In the above equation

$m_i^{(t)} = \frac{w_0^{(t)} + w_i^{(t)}}{\sum_j \left( w_0^{(t)} + w_j^{(t)} \right)},$

$i$ can represent any node in the effective participating nodes, $\min_j\left( \delta_j^{(t)} \right)$ can represent the smallest marginal loss in marginal losses of the effective participating nodes in the $t$th round of the training; and $\sum_j \left( \delta_j^{(t)} - \min_j\left( \delta_j^{(t)} \right) \right)$ represents the sum, over the effective participating nodes, of the difference between the marginal loss of each effective participating node and the smallest marginal loss in the $t$th round of the training.

In the embodiments of this specification, the participating node participating in the federated learning model includes a plurality of participating nodes, and the degree of participation of the participating node can be represented by a contribution degree of the participating node in the federated learning. The determining a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient can specifically include the following: determining a node contribution degree of each of the plurality of participating nodes based on a node local gradient of the participating node and the global gradient; and determining a relative contribution degree of the participating node based on a node contribution degree of the participating node and the node contribution degree of each participating node.

In the embodiments of this specification, distance mapping of the node local gradients of the participating nodes to the global gradient (that is, the scalar projection of a node local gradient onto the direction of the global gradient) can be used as a benefit function of contribution evaluation, and then the contribution degrees of the participating nodes can be calculated by using Shapley values. Specifically, in the $t$th round of the training, a benefit function $V(i)$ of the participating node $i$ can be: $V(i) = |u_i^{(t)}| \cos\left( u_i^{(t)}, u_G^{(t)} \right)$. Then, the previously described benefit function can be substituted into calculation using the Shapley values, to obtain a node contribution degree $\phi_i^{(t)}$ of the participating node $i$ in the $t$th round: $\phi_i^{(t)} = \alpha_i^{(t)} |u_i^{(t)}| \cos\left( u_i^{(t)}, u_G^{(t)} \right)$, where $\alpha_i^{(t)}$ represents an aggregation weight of the participating node $i$ in the $t$th round.

In the embodiments of this specification, the Shapley values are used to calculate the contribution degrees of the participating nodes, and the distance mapping of the node local gradients of the participating nodes on the global gradient is used as the benefit function of the Shapley values. The method satisfies the idea of contribution evaluation in cooperative games, overcomes the disadvantage of using only accuracy in related work as a contribution, and has smaller computational overheads than those of real Shapley values, thereby saving resources and improving computational efficiency.
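The closed-form contribution degree can be sketched as below. Since $|u| \cos(u, g) = (u \cdot g) / |g|$, the sketch computes the dot product directly. The function name `contribution_degree`, the list-of-floats representation, and returning 0 when the global gradient is the zero vector are assumptions made for illustration.

```python
import math
from typing import List


def contribution_degree(u: List[float], g: List[float], alpha: float) -> float:
    """Return alpha * |u| * cos(u, g), computed as alpha * (u . g) / |g|.

    u is a node local gradient, g the global gradient, and alpha the node's
    aggregation weight for the round (assumed to be supplied by the caller).
    """
    norm_g = math.sqrt(sum(x * x for x in g))
    if norm_g == 0.0:
        # Assumption: the cosine is undefined for a zero global gradient,
        # so the contribution is taken as 0 in this sketch.
        return 0.0
    dot = sum(a * b for a, b in zip(u, g))
    return alpha * dot / norm_g


# Toy usage: a gradient aligned with g contributes positively,
# and one pointing the opposite way contributes negatively.
phi_aligned = contribution_degree([3.0, 4.0], [1.0, 0.0], alpha=0.5)
phi_opposed = contribution_degree([-1.0, 0.0], [1.0, 0.0], alpha=0.5)
```

Each contribution needs only one dot product and one norm, which is where the saving over enumerating node coalitions for exact Shapley values comes from.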

To better reflect stability of the participating node in the entire federated learning, in the embodiments of this specification, the relative contribution degree of the participating node in the entire training process can be further determined based on the contribution degree of the participating node in each round of training. A node cumulative contribution of each participating node can be determined based on a node contribution degree of each participating node; and a relative contribution degree of the participating node is determined based on the node cumulative contribution of the participating node and the largest node cumulative contribution in node cumulative contributions of participating nodes.

As an implementation, a relative contribution degree $z_i^{(t)}$ of the participating node $i$ in the $t$th round of the training can be expressed as:

$z_i^{(t)} = \frac{c_i^{(t)}}{\max_j\left( c_j^{(t)} \right)},$

where $c_i^{(t)}$ represents a cumulative contribution of the participating node $i$ from the 1st round to the $t$th round, and $c_i^{(t)} = \max\left( 0, \sum_{\tau=1}^{t} \phi_i^{(\tau)} \right)$.
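The cumulative and relative contribution degrees can be sketched as follows, as a minimal example. The function name `relative_contributions`, the per-node history lists, and returning all zeros when no node has a positive cumulative contribution are assumptions for illustration.

```python
from typing import Dict, List


def relative_contributions(history: Dict[str, List[float]]) -> Dict[str, float]:
    """Map per-round contribution degrees to relative contribution degrees.

    history[i] holds the contribution degrees of node i for rounds 1..t.
    """
    # Cumulative contribution, clipped at 0 so it is never negative.
    cumulative = {i: max(0.0, sum(phis)) for i, phis in history.items()}
    largest = max(cumulative.values())
    if largest == 0.0:
        # Assumption: with no positive cumulative contribution, all are 0.
        return {i: 0.0 for i in cumulative}
    return {i: c / largest for i, c in cumulative.items()}


# Toy usage over two rounds (values are illustrative only).
z = relative_contributions({"a": [1.0, 2.0], "b": [0.5, 1.0], "c": [-1.0, 0.5]})
```

Dividing by the largest cumulative contribution puts every value in the range from 0 to 1, with the top contributor at exactly 1.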

In the embodiments of this specification, the actual model gradient of the participating node can be determined based on the relative contribution degree. A higher relative contribution degree of the participating node indicates that the actual model gradient obtained by the participating node is closer to the global gradient.

In the embodiments of this specification, reputations of participating nodes can be further determined based on the historical behavior of the participating nodes, and can also be used to indicate reliability or trustworthiness of the participating nodes. In this way, fluctuations of the participating nodes in the federated learning can be smoothed, and the degrees of participation of the participating nodes can be more accurately determined.

In the embodiments of this specification, the global gradient includes a gradient obtained through a target quantity of rounds of iterative calculations. The determining a degree of participation of the participating node can further include the following: obtaining a trustworthiness parameter of the participating node, where the trustworthiness parameter is used to represent a comprehensive degree of reliability of the participating node in the target quantity of rounds of calculation processes of the global gradient; and determining a reputation degree of the participating node based on the relative contribution degree and the trustworthiness parameter.

As an implementation, in the embodiments of this specification, a reputation degree $r_i^{(t)}$ of the participating node $i$ in the $t$th round can be determined based on $r_i^{(t)} = q_i^{(t)} z_i^{(t)}$, where $q_i^{(t)}$ represents a trustworthiness parameter of the participating node $i$ in the $t$th round, and $z_i^{(t)}$ represents a relative contribution degree of the participating node $i$ in the $t$th round.

In the embodiments of this specification, the obtaining atrustworthiness parameter of the participating node can specificallyinclude the following: determining a first quantity of times that theparticipating node is determined as an effective participating node inthe target quantity of rounds, where the effective participating noderepresents a participating node with a marginal loss greater than orequal to a predetermined loss threshold; determining a second quantityof times that the participating node is determined as an ineffectiveparticipating node in the target quantity of rounds, where theineffective participating node represents a participating node with amarginal loss less than the predetermined loss threshold; anddetermining the trustworthiness parameter of the participating nodebased on the first quantity of times and the second quantity of times.
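The counting of the first quantity of times and the second quantity of times described above can be sketched as follows. The function name, the list of per-round marginal losses, and the threshold value of 0.0 are hypothetical and are chosen only for illustration.

```python
def count_pass_fail(marginal_losses, loss_threshold):
    """Return (n_pass, n_fail) for one node over the rounds seen so far.

    A round counts as a pass when the marginal loss is greater than or
    equal to the threshold, and as a fail otherwise.
    """
    n_pass = sum(1 for loss in marginal_losses if loss >= loss_threshold)
    n_fail = len(marginal_losses) - n_pass
    return n_pass, n_fail


# Illustrative per-round marginal losses of one participating node.
losses_of_node = [0.05, -0.02, 0.10, 0.00, -0.01]
n_pass, n_fail = count_pass_fail(losses_of_node, loss_threshold=0.0)
```

With these sample values, three rounds meet the threshold and two do not.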

As an implementation, in the embodiments of this specification,

${q_{i}^{(t)} = ae^{be^{cx_{i}^{(t)}}}}$

can be used to represent the trustworthiness parameters $q_{i}^{(t)}$ of the participating nodes, can represent degrees of trustworthiness of the participating nodes in the $t^{th}$ round, and can also represent a comprehensive degree of reliability of the participating nodes in the entire federated learning for calculating the global gradient from the 1st round to the $t^{th}$ round, where

${x_{i}^{(t)} = \frac{{\beta n_{i}^{pass}} - {\left( {1 - \beta} \right)n_{i}^{fail}}}{{\beta n_{i}^{pass}} + {\left( {1 - \beta} \right)n_{i}^{fail}}}},$

$\beta$ represents a constant coefficient that can be set based on actual needs, $n_{i}^{pass}$ represents a quantity of times that the participating node i is determined as an effective participating node in the training process from the 1st round to the $t^{th}$ round, and $n_{i}^{fail}$ represents a quantity of times that the participating node i is determined as an ineffective participating node in the training process from the 1st round to the $t^{th}$ round.
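The trustworthiness parameter and the reputation degree described above can be sketched in Python as follows. This specification does not fix the constants a, b, and c or the coefficient β, so the default values below are placeholders assumed for illustration only. With a positive a and negative b and c, the resulting curve increases with the pass/fail ratio x, which is also an assumption of this sketch. The function names and the neutral value returned for a node with no recorded rounds are likewise hypothetical.

```python
import math


def trust_ratio(n_pass, n_fail, beta=0.5):
    """x_i^(t): weighted pass count minus weighted fail count over their sum."""
    weighted_pass = beta * n_pass
    weighted_fail = (1.0 - beta) * n_fail
    denominator = weighted_pass + weighted_fail
    if denominator == 0:
        # Assumption: a node with no recorded rounds gets a neutral ratio of 0.
        return 0.0
    return (weighted_pass - weighted_fail) / denominator


def trustworthiness(x, a=1.0, b=-1.0, c=-5.5):
    """q_i^(t) = a * exp(b * exp(c * x)); a, b, c are placeholder constants."""
    return a * math.exp(b * math.exp(c * x))


def reputation(q, z):
    """r_i^(t) = q_i^(t) * z_i^(t)."""
    return q * z


x = trust_ratio(n_pass=8, n_fail=2, beta=0.5)  # (4 - 1) / (4 + 1)
q = trustworthiness(x)
r = reputation(q, z=0.5)
```

Because x lies in [-1, 1], the placeholder constants keep q strictly between 0 and a, so a node that is frequently ineffective receives a reputation degree well below its relative contribution degree.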

In the embodiments of this specification, the global gradient can include a plurality of global gradient factors, the participating node participating in the federated learning model can include a plurality of participating nodes, and the global gradient can include the gradient obtained through the target quantity of rounds of iterative calculations; and the determining an actual model gradient of the participating node based on the degree of participation can specifically include the following: determining a quantity of matching gradients corresponding to the participating node based on a ratio of the reputation degree of the participating node to the greatest reputation degree, where the greatest reputation degree is used to represent the greatest reputation degree in reputation degrees of the plurality of participating nodes; and selecting global gradient factors of the quantity of matching gradients from the global gradient to obtain the actual model gradient.

In the embodiments of this specification, at least a part of the global gradient can be sent to a corresponding participating node based on a reputation degree of the participating node so that the participating node can obtain a model gradient matching the participating node, and each participating node can obtain a federated learning model that matches the node.

In the embodiments of this specification, a quantity $num_{i}^{(t)}$ of matching gradients allocated to the participating node i in the $t^{th}$ round can be obtained based on

${{num}_{i}^{(t)} = \frac{r_{i}^{(t)}}{\max_{i}r_{i}^{(t)}}\left| u_{G}^{(t)} \right|}.$

Here, $\left| u_{G}^{(t)} \right|$ can represent a total quantity of global gradient factors included in the global gradient in the $t^{th}$ round. $r_{i}^{(t)} \in (0, 1]$, and a participating node with the highest reputation degree can obtain the entire global gradient. In practice, if the calculated quantity $num_{i}^{(t)}$ of matching gradients is not an integer, the quantity of matching gradients can be obtained by using a rounding method such as rounding to the nearest integer, rounding up, or rounding down.
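As a minimal sketch of the allocation formula above, the following snippet computes the quantity of matching gradients for each node and lets the caller choose the rounding method. The function name, the dictionary of reputation degrees, and the choice of rounding down by default are assumptions made for illustration.

```python
import math


def matching_gradient_counts(reputations, total_factors, rounding=math.floor):
    """num_i^(t) = r_i^(t) / max_i r_i^(t) * |u_G^(t)|, rounded to an integer.

    `rounding` can be math.floor (round down) or math.ceil (round up).
    """
    top = max(reputations.values())
    if top <= 0:
        raise ValueError("reputation degrees are expected to lie in (0, 1]")
    return {
        node: int(rounding(r / top * total_factors))
        for node, r in reputations.items()
    }


# Illustrative reputation degrees and a global gradient with 10 factors.
reputations = {"A": 0.8, "B": 0.4, "C": 0.2}
counts_down = matching_gradient_counts(reputations, total_factors=10)
counts_up = matching_gradient_counts(
    reputations, total_factors=10, rounding=math.ceil
)
```

Node A has the highest reputation degree and is allocated all 10 factors. Node C is allocated 2.5 factors before rounding, that is, 2 when rounding down and 3 when rounding up.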

In practice, when participating nodes use local data for model training, since characteristics of the training data used by different participating nodes may have some differences, different participating nodes may have different degrees of influence on model parameters in a global model. Node local gradients provided by the participating nodes can also reflect the needs of the participating nodes. For example, a node local gradient provided by participating node A emphatically reflects a relationship between a user age and user needs for products, and may have a greater influence on the user age and the user needs for products in the global gradient; a node local gradient provided by participating node B emphatically reflects a relationship between education background of a user and user needs for products, and may have a greater influence on the education background of the user and the user needs for products in the global gradient. Accordingly, gradient factors reflecting a user age and user needs for products in the global gradient are fed back to participating node A, and gradient factors reflecting education background of the user and user needs for products in the global gradient are fed back to participating node B.

To make an actual gradient allocated to the participating nodes more in line with the needs of the participating nodes, in the embodiments of this specification, the degree of influence of each gradient factor in the node local gradient on the global gradient can also be determined based on the node local gradients provided by the participating nodes. A global gradient factor with a greater influence in the global gradient is sent to the participating node.

Optionally, in the embodiments of this specification, the global gradient includes a plurality of global gradient factors, and the node local gradient includes a plurality of local gradient factors; and the method in the embodiments of this specification can further include the following: obtaining a node influence degree of each global gradient factor relative to the participating node, where the node influence degree is used to indicate a degree that each global gradient factor is influenced by the participating node; and sorting the global gradient factors in the global gradient based on the node influence degree, to obtain the sorted global gradient factors; and the selecting global gradient factors of the quantity of matching gradients from the global gradient to obtain the actual model gradient can specifically include the following: selecting global gradient factors of the quantity of matching gradients from the sorted global gradient factors based on a predetermined order, to obtain the actual model gradient.

In practice, global gradient factors in the global gradient can be sorted based on the node influence degree. Optionally, in the embodiments of this specification, the sorting the global gradient factors in the global gradient based on the node influence degree can specifically include the following: sorting the global gradient factors in the global gradient based on the node influence degree in descending order; and the selecting global gradient factors of the quantity of matching gradients from the sorted global gradient factors, to obtain the actual model gradient specifically includes the following: selecting global gradient factors of the top quantity of matching gradients from the global gradient factors sorted in descending order, to obtain the actual model gradient.

Assume that the global gradient includes four global gradient factors M, N, P, and Q, node influence degrees of participating node A on these four global gradient factors are A1, A2, A3, and A4, where A3>A2>A4>A1, and the quantity of matching gradients corresponding to participating node A is 3. It can be determined that the global gradient factors N, P, and Q with the top 3 node influence degrees are sent to participating node A.
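The example above can be reproduced with the following sketch, which sorts the global gradient factors by node influence degree in descending order and keeps the first entries. The function name and the numeric influence values are hypothetical; the values are chosen only so that A3>A2>A4>A1 holds.

```python
def select_matching_factors(influence, count):
    """Return the names of the `count` factors with the highest influence."""
    ranked = sorted(influence, key=influence.get, reverse=True)
    return ranked[:count]


# Influence degrees of participating node A on factors M, N, P, Q
# (A1, A2, A3, A4), with illustrative values satisfying A3 > A2 > A4 > A1.
influence_of_node_a = {"M": 0.1, "N": 0.3, "P": 0.4, "Q": 0.2}
selected = select_matching_factors(influence_of_node_a, count=3)
```

Here `selected` contains P, N, and Q in that order, and factor M, which has the lowest node influence degree, is not sent to participating node A.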

In practice, the global gradient factors in the global gradient can also be sorted in ascending order based on the node influence degrees, and the factors with the last rankings are determined as the actual model gradient of the participating node. The specific sorting method is not limited here, as long as actual needs can be satisfied.

In the embodiments of this specification, the obtaining a node influence degree of each global gradient factor relative to the participating node can specifically include the following: determining a first distribution parameter corresponding to each local gradient factor in the local gradient of the participating node, where the first distribution parameter is used to indicate a proportion of each local gradient factor in the local gradient; determining a second distribution parameter corresponding to each global gradient factor in the global gradient, where the second distribution parameter is used to indicate a proportion of each global gradient factor in the global gradient; and determining the node influence degree of each global gradient factor relative to the participating node based on the first distribution parameter and the second distribution parameter.

In practice, the global gradient factors in the global gradient can be sorted in predetermined attribute order, and local gradient factors in the node local gradient can also be sorted in the predetermined attribute order.

As an implementation, in the embodiments of this specification, the node influence degree $s_{j}^{(t)}$ corresponding to the $j^{th}$ node local gradient factor in the node local gradient provided by the participating node i in the $t^{th}$ round can be expressed as $s_{j}^{(t)} = s_{i,j}^{(t)} \times s_{G,j}^{(t)}$, where $s_{i,j}^{(t)}$ represents the first distribution parameter, and $s_{G,j}^{(t)}$ represents the second distribution parameter.

A first distribution parameter $s_{i,j}^{(t)}$ of the $j^{th}$ node local gradient factor in the node local gradient of the participating node i in the $t^{th}$ round can be expressed as

${s_{i,j}^{(t)} = \frac{u_{i,j}^{(t)}}{\Sigma_{j}u_{i,j}^{(t)}}},$

where $u_{i,j}^{(t)}$ represents the $j^{th}$ node local gradient factor in the node local gradient of the participating node i in the $t^{th}$ round, and $\Sigma_{j}u_{i,j}^{(t)}$ represents the sum of the node local gradient factors in the node local gradient $u_{i}^{(t)}$ provided by the participating node i in the $t^{th}$ round.

A second distribution parameter $s_{G,j}^{(t)}$ of the $j^{th}$ global gradient factor in the global gradient in the $t^{th}$ round can be expressed as

${s_{G,j}^{(t)} = \frac{u_{G,j}^{(t)}}{\Sigma_{j}u_{G,j}^{(t)}}},$

where $u_{G,j}^{(t)}$ represents the $j^{th}$ global gradient factor in the global gradient in the $t^{th}$ round, and $\Sigma_{j}u_{G,j}^{(t)}$ represents the sum of the global gradient factors in the global gradient $u_{G}^{(t)}$ in the $t^{th}$ round.
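The two distribution parameters and their product can be sketched as follows. The function names and sample gradients are hypothetical, both gradients are assumed to list their factors in the same predetermined attribute order, and raising an error when the factors sum to zero is an assumption of this sketch, because the proportions are undefined in that case.

```python
def distribution_parameters(factors):
    """Proportion of each gradient factor in the sum of all factors."""
    total = sum(factors)
    if total == 0:
        raise ValueError("factors sum to zero, so proportions are undefined")
    return [value / total for value in factors]


def node_influence_degrees(local_gradient, global_gradient):
    """s_j = s_{i,j} * s_{G,j} for factors listed in the same order."""
    if len(local_gradient) != len(global_gradient):
        raise ValueError("gradients must have the same number of factors")
    s_local = distribution_parameters(local_gradient)
    s_global = distribution_parameters(global_gradient)
    return [s_i * s_g for s_i, s_g in zip(s_local, s_global)]


# Illustrative gradients with three factors each.
local_gradient = [1.0, 2.0, 1.0]
global_gradient = [2.0, 2.0, 4.0]
influence = node_influence_degrees(local_gradient, global_gradient)
```

For these values, the first distribution parameters are 0.25, 0.5, and 0.25, the second distribution parameters are 0.25, 0.25, and 0.5, and the node influence degrees are 0.0625, 0.125, and 0.125.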

In the embodiments of this specification, the marginal loss is used as a measure of data quality of each participating node in a gradient aggregation process, and gradient aggregation is performed in combination with the data quality and quantity. The embodiments can make the aggregated model converge faster and have higher accuracy.

Moreover, in the embodiments of this specification, the data quality and contribution of the participating nodes are comprehensively considered, and a cumulative reputation of the participating nodes is calculated to determine the quantity of gradients allocated to each participating node. When the gradient is selected, the local gradient distribution of the participating nodes is also considered so that the participating nodes can obtain models that match the participating nodes, and participating nodes with higher contributions can obtain higher model accuracy.

To more clearly illustrate the model gradient determining method based on federated learning provided in the embodiments of this specification, FIG. 3 is a swimlane diagram illustrating a model gradient determining method based on federated learning according to some embodiments of this specification. As shown in FIG. 3, the method can include a node training phase, a gradient aggregation phase, and a gradient determining phase, and can specifically include the following:

Step 302: A server sends a basic training model to each participating node participating in federated learning.

Step 304: The participating node obtains the basic training model.

Step 306: The participating node trains the basic training model based on local data to obtain a trained model.

Step 308: The participating node can determine a node local gradient based on the trained model.

Step 310: The participating node sends, to the server, the node local gradient and data volume information of training data used for training the basic training model based on the local data.

Step 312: The server obtains the node local gradient and the data volume information of the participating node.

In practice, each participating node participating in the federated learning can perform step 304 to step 310, and the server can obtain the node local gradient and the data volume information of each participating node.

Step 314: After obtaining the node local gradient and the data volume information sent by each participating node participating in the federated learning, the server can determine, based on the data volume information and the node local gradient, the global gradient of the federated learning model that the participating node participates in.

In the embodiments of this specification, marginal losses of the participating nodes can also be determined based on the node local gradients of the participating nodes, and based on the marginal losses, effective participating nodes and node mass of the participating nodes can be determined from the participating nodes. In the embodiments of this specification, the global gradient of the federated learning model can be determined based on the node local gradients, the data volume information, and node mass of the effective participating nodes, which can effectively improve convergence of the model and improve training efficiency.
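One possible way to combine the node local gradients, the data volume information, and the node mass in step 314 is sketched below. The exact weighting rule is not restated in this passage, so the rule used here is an assumption made only for illustration: ineffective nodes are dropped, node mass is taken as the marginal loss normalized over the effective nodes, and each effective node is weighted by its data volume multiplied by its node mass. The function name, the dictionary keys, and the sample values are hypothetical.

```python
def aggregate_global_gradient(updates, loss_threshold=0.0):
    """Weighted average of the local gradients of the effective nodes.

    Each update is a dict with the hypothetical keys "gradient",
    "data_volume", and "marginal_loss".
    """
    effective = [u for u in updates if u["marginal_loss"] >= loss_threshold]
    if not effective:
        raise ValueError("no effective participating node in this round")

    loss_sum = sum(u["marginal_loss"] for u in effective)
    weights = []
    for u in effective:
        # Assumed node mass: marginal loss normalized over effective nodes,
        # falling back to equal mass when the losses do not sum to a
        # positive value.
        if loss_sum > 0:
            mass = u["marginal_loss"] / loss_sum
        else:
            mass = 1.0 / len(effective)
        weights.append(u["data_volume"] * mass)

    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("all effective nodes have zero weight")
    size = len(effective[0]["gradient"])
    return [
        sum(w * u["gradient"][j] for w, u in zip(weights, effective)) / weight_sum
        for j in range(size)
    ]


# Illustrative updates; the third node is ineffective and is ignored.
updates = [
    {"gradient": [1.0, 0.0], "data_volume": 100, "marginal_loss": 0.5},
    {"gradient": [0.0, 1.0], "data_volume": 300, "marginal_loss": 0.5},
    {"gradient": [9.0, 9.0], "data_volume": 500, "marginal_loss": -0.25},
]
global_gradient = aggregate_global_gradient(updates)
```

Under these assumptions, the two effective nodes have equal node mass, so the node with three times the data volume contributes three times the weight to the global gradient.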

Step 316: The server can determine a contribution degree of the participating node based on the node local gradient and the global gradient, and determine a reputation degree of the participating node based on the contribution degree.

Step 318: The server can determine a quantity of matching gradients corresponding to the participating node based on the reputation degree of the participating node.

Step 320: The server determines a node influence degree of each global gradient factor in the global gradient relative to the participating node, sorts the global gradient factors in the global gradient based on the node influence degree, and obtains the sorted global gradient factors.

Step 322: The server selects global gradient factors of the quantity of matching gradients from the sorted global gradient factors based on a predetermined order, to obtain an actual model gradient.

Step 324: The server sends the determined actual model gradient matching the participating node to the participating node.

The server can send the actual model gradient corresponding to each participating node to each participating node, and each participating node can generate a federated learning model based on the actual model gradient that is received by the participating node. For an iterative training process in the federated learning, the participating nodes can update training models of the participating nodes based on the received actual model gradients to obtain the latest version of the training model, where the latest version of the training model can be considered as the basic training model. Then the participating nodes can train the latest version of the training model based on the local data of the participating nodes, and feed back, to the server for aggregation, the node local gradient obtained from the training.
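The node-side update described above can be sketched as follows. This specification does not fix the update rule, so a plain gradient descent step with a hypothetical learning rate is assumed, and the actual model gradient is represented as a mapping from parameter index to the received global gradient factor. Parameters for which no factor was received are left unchanged. All names and values are illustrative.

```python
def apply_actual_gradient(parameters, actual_gradient, learning_rate=0.1):
    """Return updated parameters after one assumed gradient descent step.

    `actual_gradient` maps a parameter index to the global gradient factor
    that the server sent for that parameter.
    """
    updated = list(parameters)
    for index, value in actual_gradient.items():
        updated[index] -= learning_rate * value
    return updated


# Illustrative model with four parameters; the node received two factors.
parameters = [1.0, 2.0, 3.0, 4.0]
actual_gradient = {1: 0.5, 3: -1.0}
latest = apply_actual_gradient(parameters, actual_gradient, learning_rate=0.5)
```

The list `latest` then plays the role of the basic training model for the next round of local training.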

Based on the same idea, the embodiments of this specification further provide apparatuses corresponding to the previous methods. FIG. 4 is a schematic structural diagram of a model gradient determining apparatus based on federated learning according to some embodiments of this specification. As shown in FIG. 4, the apparatus can include the following: a volume information acquisition module 402, configured to obtain data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node; a local gradient acquisition module 404, configured to obtain a node local gradient obtained by training the basic training model based on the local data by the participating node; a global gradient determining module 406, configured to determine, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; a degree-of-participation determining module 408, configured to determine a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training; and an actual gradient determining module 410, configured to determine an actual model gradient of the participating node based on the degree of participation.

Based on the same idea, the embodiments of this specification further provide devices corresponding to the previous methods.

FIG. 5 is a schematic structural diagram of a model gradient determining device based on federated learning according to some embodiments of this specification. As shown in FIG. 5, the device 500 can include the following: at least one processor 510; and a memory 530 communicatively connected to the at least one processor.

The memory 530 stores an instruction 520 that can be executed by the at least one processor 510, and the instruction is executed by the at least one processor 510 so that the at least one processor 510 can: obtain data volume information of a participating node, where the data volume information is used to indicate an amount of data used by the participating node to train a basic training model based on local data, and the local data includes user data of a target organization corresponding to the participating node; obtain a node local gradient obtained by training the basic training model based on the local data by the participating node; determine, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determine a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, where the degree of participation is used to indicate a degree of participation of the participating node in federated learning model training; and determine an actual model gradient of the participating node based on the degree of participation.

Based on the same idea, the embodiments of this specification further provide a computer-readable medium corresponding to the previous methods. The computer-readable medium stores a computer-readable instruction, where the computer-readable instruction can be executed by a processor to implement the previously described model gradient determining method based on federated learning.

The embodiments in this specification are described in a progressive way. For same or similar parts of the embodiments, references can be made to the embodiments mutually. Each embodiment focuses on a difference from other embodiments. Particularly, the device shown in FIG. 5 is similar to a method embodiment, and therefore is described briefly. For related parts, references can be made to related descriptions in the method embodiment.

In the 1990s, whether a technical improvement is a hardware improvement (for example, an improvement to a circuit structure, such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method procedure) could be clearly distinguished. However, as technologies develop, current improvements to many method procedures can be considered as direct improvements to hardware circuit structures. A designer usually programs an improved method procedure into a hardware circuit, to obtain a corresponding hardware circuit structure. Therefore, a method procedure can be improved by using a hardware entity module. For example, a programmable logic device (PLD) (for example, a field programmable gate array (FPGA)) is such an integrated circuit, and a logical function of the PLD is determined by a user through device programming. The designer performs programming to “integrate” a digital system to a PLD without requesting a chip manufacturer to design and produce an application-specific integrated circuit (ASIC) chip. In addition, at present, instead of manually manufacturing an integrated circuit chip, this type of programming is mostly implemented by using “logic compiler” software. The software is similar to a software compiler used to develop and write a program. Original code needs to be written in a particular programming language for compilation. The language is referred to as a hardware description language (HDL). There are many HDLs, such as the Advanced Boolean Expression Language (ABEL), the Altera Hardware Description Language (AHDL), Confluence, the Cornell University Programming Language (CUPL), HDCal, the Java Hardware Description Language (JHDL), Lava, Lola, MyHDL, PALASM, and the Ruby Hardware Description Language (RHDL). The very-high-speed integrated circuit hardware description language (VHDL) and Verilog are most commonly used. A person skilled in the art should also understand that a hardware circuit that implements a logical method procedure can be readily obtained once the method procedure is logically programmed by using the several described hardware description languages and is programmed into an integrated circuit.

A controller can be implemented by using any appropriate method. For example, the controller can be a microprocessor or a processor, or a computer-readable medium that stores computer-readable program code (such as software or firmware) that can be executed by the microprocessor or the processor, a logic gate, a switch, an application-specific integrated circuit (ASIC), a programmable logic controller, or a built-in microprocessor. Examples of the controller include but are not limited to the following microprocessors: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. The memory controller can also be implemented as a part of the control logic of the memory. A person skilled in the art also knows that, in addition to implementing the controller by using the computer-readable program code, logic programming can be performed on method steps to allow the controller to implement the same function in forms of the logic gate, the switch, the application-specific integrated circuit, the programmable logic controller, and the built-in microcontroller. Therefore, the controller can be considered as a hardware component, and an apparatus configured to implement various functions in the controller can also be considered as a structure in the hardware component. Or the apparatus configured to implement various functions can even be considered as both a software module implementing the method and a structure in the hardware component.

The system, apparatus, module, or unit illustrated in the previous embodiments can be implemented by using a computer chip or an entity, or can be implemented by using a product having a certain function. A typical implementation device is a computer. Specifically, the computer can be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email receiving and sending device, a game console, a tablet computer, a wearable device, or any combination of these devices.

For ease of description, the apparatus above is described by dividing functions into various units. Certainly, when this application is implemented, a function of each unit can be implemented in one or more pieces of software and/or hardware.

A person skilled in the art should understand that some embodiments of this application can be provided as a method, a system, or a computer program product. Therefore, this application can use a form of hardware only embodiments, software only embodiments, or embodiments with a combination of software and hardware. Moreover, the embodiments of this application can use a form of a computer program product that is implemented on one or more computer-usable storage media (including but not limited to a disk memory, a CD-ROM, an optical memory, etc.) that include computer-usable program code.

This application is described with reference to the flowcharts and/or block diagrams of the method, the device (system), and the computer program product based on the embodiments of this application. It is worthwhile to note that computer program instructions can be used to implement each process and/or each block in the flowcharts and/or the block diagrams and a combination of a process and/or a block in the flowcharts and/or the block diagrams. These computer program instructions can be provided for a general-purpose computer, a dedicated computer, an embedded processor, or a processor of another programmable data processing device to generate a machine so that the instructions executed by the computer or the processor of the another programmable data processing device generate a device for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions can be stored in a computer-readable memory that can instruct the computer or other programmable data processing devices to work in a specific way, so the instructions stored in the computer-readable memory generate an artifact that includes an instruction apparatus. The instruction apparatus implements a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions can be loaded onto the computer or another programmable data processing device so that a series of operations and steps are performed on the computer or other programmable devices, thereby generating computer-implemented processing. Therefore, the instructions executed on the computer or other programmable devices provide steps for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

In a typical configuration, a computing device includes one or more processors (CPU), an input/output interface, a network interface, and a memory.

The memory can include a non-persistent memory, a random access memory (RAM), a non-volatile memory, and/or another form that are in a computer-readable medium, for example, a read-only memory (ROM) or a flash memory (flash RAM). The memory is an example of the computer-readable medium.

The computer-readable medium includes persistent, non-persistent, movable, and unmovable media that can store information by using any method or technology. The information can be a computer-readable instruction, a data structure, a program module, or other data. Examples of the computer storage medium include, but are not limited to, a phase change random access memory (PRAM), a static RAM (SRAM), a dynamic RAM (DRAM), a RAM of another type, a ROM, an electrically erasable programmable ROM (EEPROM), a flash memory or another memory technology, a compact disc ROM (CD-ROM), a digital versatile disc (DVD) or another optical storage, a cassette, and a cassette magnetic disk storage, or another magnetic storage device or any other non-transmission medium. The computer storage medium can be configured to store information that can be accessed by a computing device. As described in this specification, the computer-readable medium does not include transitory media such as a modulated data signal and a carrier.

It is worthwhile to further note that the terms “include”, “contain”, or any other variants thereof are intended to cover a non-exclusive inclusion, so a process, a method, a product, or a device that includes a list of elements not only includes those elements but also includes other elements which are not expressly listed, or further includes elements inherent to such process, method, product, or device. Without more constraints, an element preceded by “includes a . . . ” does not preclude the existence of additional identical elements in the process, method, product, or device that includes the element.

A person skilled in the art should understand that the embodiments of this application can be provided as a method, a system, or a computer program product. Therefore, this application can use a form of hardware only embodiments, software only embodiments, or embodiments with a combination of software and hardware. Moreover, this application can use a form of a computer program product that is implemented on one or more computer-usable storage media (including but not limited to a disk memory, a CD-ROM, an optical memory, etc.) that include computer-usable program code.

This application can be described in the general context of computer-executable instructions, for example, a program module. Generally, the program module includes a routine, a program, an object, a component, a data structure, etc. executing a specific task or implementing a specific abstract data type. This application can alternatively be practiced in distributed computing environments in which tasks are performed by remote processing devices that are connected through a communications network. In a distributed computing environment, the program module can be located in both local and remote computer storage media including storage devices.

The previous descriptions are merely embodiments of this application, and are not intended to limit this application. A person skilled in the art can make various modifications and changes to this application. Any modification, equivalent replacement, or improvement made without departing from the spirit and principle of this application shall fall within the scope of the claims in this application.

What is claimed is:
 1. A computer-implemented method for model gradientdetermination based on federated learning, comprising: obtaining datavolume information of a participating node, wherein the data volumeinformation indicates an amount of data used by the participating nodeto train, based on local data, a basic training model, and wherein thelocal data comprises user data of a target organization corresponding tothe participating node; obtaining, based on the local data and by theparticipating node, a node local gradient by training the basic trainingmodel; determining, based on the data volume information and the nodelocal gradient, a global gradient of a federated learning model that theparticipating node participates in; determining, based on the node localgradient of the participating node and the global gradient, a degree ofparticipation of the participating node, wherein the degree ofparticipation indicates a degree of participation of the participatingnode in federated learning model training; and determining, based on thedegree of participation, an actual model gradient of the participatingnode.
 2. The computer-implemented method of claim 1, wherein, after theobtaining a node local gradient obtained by training the basic trainingmodel based on the local data by the participating node: obtaining amarginal loss of the participating node, wherein the marginal lossrepresents a degree of influence of the node local gradient of theparticipating node on performance of the federated learning model; anddetermining, based on the marginal loss, node mass of the participatingnode.
 3. The computer-implemented method of claim 2, wherein determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in, comprises: determining, based on the data volume information, the node local gradient, and the node mass, the global gradient of the federated learning model that the participating node participates in.
 4. The computer-implemented method of claim 3, wherein the participating node participating in the federated learning model comprises a plurality of participating nodes.
 5. The computer-implemented method of claim 4, wherein obtaining a marginal loss of the participating node, comprises: determining, based on a node local gradient of each participating node in the plurality of participating nodes, a first reference global model; determining, based on a node local gradient of each participating node other than the participating node in the plurality of participating nodes, a second reference global model; determining, based on a predetermined verification set, a first model loss of the first reference global model; determining, based on the predetermined verification set, a second model loss of the second reference global model; and determining, based on the first model loss and the second model loss, the marginal loss of the participating node.
 6. The computer-implemented method of claim 3, wherein the participating node participating in the federated learning model comprises a plurality of participating nodes.
 7. The computer-implemented method of claim 6, wherein the determining, based on the marginal loss, node mass of the participating node, comprises: determining, based on a marginal loss of each participating node in the plurality of participating nodes and a normalization algorithm, the node mass of the participating node.
 8. The computer-implemented method of claim 3, wherein the participating node participating in the federated learning model comprises a plurality of participating nodes; and, comprising: determining a participating node with a marginal loss greater than or equal to a predetermined loss threshold in the plurality of participating nodes as an effective participating node.
 9. The computer-implemented method of claim 8, wherein determining, based on the data volume information, the node local gradient, and the node mass, the global gradient of the federated learning model that the participating node participates in, comprises: performing, based on data volume information of the participating node and node mass of the participating node and to obtain the global gradient, an aggregation operation on a node local gradient of each effective participating node.
 10. The computer-implemented method of claim 1, wherein the participating node participating in the federated learning model comprises a plurality of participating nodes.
 11. The computer-implemented method of claim 10, wherein the determining a degree of participation of the participating node based on the node local gradient of the participating node and the global gradient, comprises: determining a node contribution degree of each of the plurality of participating nodes based on the node local gradient of the participating node and the global gradient; and determining a relative contribution degree of the participating node based on a node contribution degree of the participating node and the node contribution degree of each participating node.
 12. The computer-implemented method of claim 11, wherein the global gradient comprises a gradient obtained through a target quantity of rounds of iterative calculations; and, comprising: obtaining a trustworthiness parameter of the participating node, wherein the trustworthiness parameter represents a comprehensive degree of reliability of the participating node in the target quantity of rounds of iterative calculations of the global gradient; and determining a reputation degree of the participating node based on the relative contribution degree and the trustworthiness parameter.
 13. The computer-implemented method of claim 12, wherein obtaining a trustworthiness parameter of the participating node, comprises: determining a first quantity of times that the participating node is determined as an effective participating node in the target quantity of rounds of iterative calculations, wherein the effective participating node represents a participating node with a marginal loss greater than or equal to a predetermined loss threshold; determining a second quantity of times that the participating node is determined as an ineffective participating node in the target quantity of rounds of iterative calculations, wherein the ineffective participating node represents a participating node with a marginal loss less than the predetermined loss threshold; and determining the trustworthiness parameter of the participating node based on the first quantity of times and the second quantity of times.
 14. The computer-implemented method of claim 12, wherein the global gradient comprises a plurality of global gradient factors, wherein the participating node participating in the federated learning model comprises a plurality of participating nodes, and wherein the global gradient comprises the gradient obtained through the target quantity of rounds of iterative calculations.
 15. The computer-implemented method of claim 14, wherein the determining an actual model gradient of the participating node based on the degree of participation, comprises: determining a quantity of matching gradients corresponding to the participating node based on a ratio of the reputation degree of the participating node to a greatest reputation degree, wherein the greatest reputation degree represents the greatest reputation degree in reputation degrees of the plurality of participating nodes; and selecting global gradient factors of the quantity of matching gradients from the global gradient to obtain the actual model gradient.
 16. The computer-implemented method of claim 15, wherein the global gradient comprises a plurality of global gradient factors, wherein the node local gradient of the participating node comprises a plurality of local gradient factors; and, comprising: obtaining a node influence degree of each global gradient factor relative to the participating node, wherein the node influence degree of each global gradient factor indicates a degree that each global gradient factor is influenced by the participating node; and sorting, based on the node influence degree of each global gradient factor and to obtain sorted global gradient factors, the global gradient factors in the global gradient.
 17. The computer-implemented method of claim 16, wherein selecting global gradient factors of the quantity of matching gradients from the global gradient to obtain the actual model gradient, comprises: selecting, based on a predetermined order and to obtain the actual model gradient, global gradient factors of the quantity of matching gradients from the sorted global gradient factors.
 18. The computer-implemented method of claim 17, wherein obtaining a node influence degree of each global gradient factor relative to the participating node, comprises: determining a first distribution parameter corresponding to each local gradient factor in the node local gradient of the participating node, wherein the first distribution parameter indicates a proportion of each local gradient factor in the node local gradient of the participating node; determining a second distribution parameter corresponding to each global gradient factor in the global gradient, wherein the second distribution parameter indicates a proportion of each global gradient factor in the global gradient; and determining, based on the first distribution parameter and the second distribution parameter, the node influence degree of each global gradient factor relative to the participating node.
 19. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations, comprising: obtaining data volume information of a participating node, wherein the data volume information indicates an amount of data used by the participating node to train, based on local data, a basic training model, and wherein the local data comprises user data of a target organization corresponding to the participating node; obtaining, based on the local data and by the participating node, a node local gradient by training the basic training model; determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determining, based on the node local gradient of the participating node and the global gradient, a degree of participation of the participating node, wherein the degree of participation indicates a degree of participation of the participating node in federated learning model training; and determining, based on the degree of participation, an actual model gradient of the participating node.
 20. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations, comprising: obtaining data volume information of a participating node, wherein the data volume information indicates an amount of data used by the participating node to train, based on local data, a basic training model, and wherein the local data comprises user data of a target organization corresponding to the participating node; obtaining, based on the local data and by the participating node, a node local gradient by training the basic training model; determining, based on the data volume information and the node local gradient, a global gradient of a federated learning model that the participating node participates in; determining, based on the node local gradient of the participating node and the global gradient, a degree of participation of the participating node, wherein the degree of participation indicates a degree of participation of the participating node in federated learning model training; and determining, based on the degree of participation, an actual model gradient of the participating node.
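
By way of non-limiting illustration only, the flow recited in claim 1 (aggregate node local gradients by data volume, derive a degree of participation from each node local gradient and the global gradient, then derive each node's actual model gradient) might be sketched as follows. The claims do not fix any formula, so every choice here is an assumption made for the example: the data-volume-weighted average, the use of cosine similarity as the contribution measure, the use of the relative contribution as the degree of participation (node mass and the trustworthiness parameter are omitted), and the ordering of global gradient factors by magnitude instead of by node influence degree. All function names are hypothetical.

```python
import math

def aggregate(local_grads, data_volumes):
    """Global gradient as a data-volume-weighted average (assumed rule)."""
    total = sum(data_volumes)
    dim = len(local_grads[0])
    return [
        sum(n * g[i] for g, n in zip(local_grads, data_volumes)) / total
        for i in range(dim)
    ]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def relative_contributions(local_grads, global_grad):
    """Assumed contribution measure: non-negative cosine similarity,
    normalized so that the values over all nodes sum to 1."""
    raw = [max(cosine(g, global_grad), 0.0) for g in local_grads]
    s = sum(raw)
    return [r / s if s else 0.0 for r in raw]

def actual_gradient(global_grad, degree, max_degree):
    """Keep a number of global gradient factors proportional to
    degree / max_degree and zero the rest. Factors are ranked by
    magnitude here, which is an assumption of this sketch."""
    ratio = degree / max_degree
    k = int(len(global_grad) * ratio)
    ranked = sorted(range(len(global_grad)),
                    key=lambda i: abs(global_grad[i]), reverse=True)
    keep = set(ranked[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(global_grad)]

# Two hypothetical participating nodes with 3 and 1 units of local data.
local_grads = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
data_volumes = [3, 1]

global_grad = aggregate(local_grads, data_volumes)
degrees = relative_contributions(local_grads, global_grad)
node_grads = [actual_gradient(global_grad, d, max(degrees)) for d in degrees]
```

In this sketch the node with the greatest degree receives every global gradient factor, while a node with a lower degree receives a proportionally smaller subset, so the participating nodes end up with different model gradients.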